40 research outputs found

    Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling

    We evaluate the impact of probabilistically-constructed digital identity data collected from Sep. to Dec. 2017 (approx.), in the context of Lookalike-targeted campaigns. The backbone of this study is a large set of probabilistically-constructed "identities", represented as small bags of cookies and mobile ad identifiers with associated metadata, that are likely all owned by the same underlying user. The identity data makes it possible to generate "identity-based", rather than "identifier-based", user models, giving a fuller picture of the interests of the users underlying the identifiers. We employ off-policy techniques to evaluate the potential of identity-powered lookalike models without incurring the risk of allowing untested models to direct large amounts of ad spend or the large cost of performing A/B tests. We add to historical work on off-policy evaluation by noting a significant type of "finite-sample bias" that occurs for studies combining modestly-sized datasets and evaluation metrics involving rare events (e.g., conversions). We illustrate this bias using a simulation study that later informs the handling of inverse propensity weights in our analyses on real data. We demonstrate significant lift in identity-powered lookalikes versus an identity-ignorant baseline: on average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for identifiers that have little data themselves but can be inferred to belong to users with substantial data to aggregate across identifiers. This implies that identity-powered user modeling is especially important in the context of identifiers having very short lifespans (i.e., frequently churned cookies). Our work motivates and informs the use of probabilistically-constructed identities in marketing. It also deepens the canon of examples in which off-policy learning has been employed to evaluate the complex systems of the internet economy. Comment: Accepted by WSDM 201

    Rice Calcineurin B-Like Protein-Interacting Protein Kinase 31 (OsCIPK31) Is Involved in the Development of Panicle Apical Spikelets

    Panicle apical abortion (PAA) causes severe yield losses in rice production, but details about its development and molecular basis remain elusive. Herein, a PAA mutant, paa1019, was identified among the progeny of a mutagenized population of the elite indica maintainer rice line Yixiang 1B (YXB), obtained using ethyl methyl sulfonate. The spikelet abortion rate in paa1019 reached up to 60%. Genetic mapping combined with Mutmap analysis revealed that LOC_Os03g20380 harbored a single-bp substitution (C to T) that altered its transcript length. This gene encodes calcineurin B-like protein-interacting protein kinase 31 (OsCIPK31), which is localized in the cytoplasm and is preferentially expressed in transport tissues of rice. Complementation of paa1019 by transferring the open reading frame of LOC_Os03g20380 from YXB reversed the mutant phenotype, and conversely, knocking out OsCIPK31 in YXB by gene editing resulted in the PAA phenotype. Our results support that OsCIPK31 plays an important role in panicle development. We found that dysregulation is caused by the disruption of OsCIPK31 function due to excessive accumulation of ROS, which ultimately leads to cell death in the rice panicle. OsCIPK31 and the MAPK pathway might act synergistically to drive ROS accumulation in response to stresses. Meanwhile, the PAA distribution is related to IAA hormone accumulation in the panicle. Our study provides an understanding of the role of OsCIPK31 in panicle development by responding to various stresses and phytohormones.

    Rapid Visual Detection of High Nitrogen-Use Efficiency Gene OsGRF4 in Rice (Oryza sativa L.) Using Loop-Mediated Isothermal Amplification Method

    The GROWTH-REGULATING FACTOR4 (OsGRF4) allele is an important target for the development of new high nitrogen-use efficiency (NUE) rice lines that would require less fertilizer. Detection of OsGRF4 through polymerase chain reaction (PCR)-based assays is cumbersome and needs advanced laboratory skills and facilities. Hence, a method for conveniently and rapidly detecting OsGRF4 in the field is a key requirement for further research and applications. In this study, we employed cleaved amplified polymorphic sequences (CAPs) and loop-mediated isothermal amplification (LAMP) techniques to develop a convenient visual detection method for the high-NUE gene OsGRF4NM73 (OsGRF4 from the rice line NM73). The TC→AA mutation at the 1187–1188 bp loci was selected as the target sequence for the OsGRF4NM73 allele. We further applied this identification method to 10 rice varieties that carried the OsGRF4 gene, and the results revealed that one variety (NM73) carried the target OsGRF4NM73 allele, while the other varieties did not possess the osgrf4 genotype. The optimal LAMP reaction using hydroxynaphthol blue (HNB), a chromogenic indicator, was carried out at 65 °C for 60 min, and the presence of the OsGRF4NM73 allele was confirmed by a color change from violet to sky blue. The results of this study showed that the LAMP method can be conveniently and accurately used to detect the OsGRF4NM73 gene in rice.

    Testing Rare-Variant Association without Calling Genotypes Allows for Systematic Differences in Sequencing between Cases and Controls

    Next-generation sequencing of DNA provides an unprecedented opportunity to discover rare genetic variants associated with complex diseases and traits. However, the common practice of first calling underlying genotypes and then treating the called values as known is prone to false positive findings, especially when genotyping errors are systematically different between cases and controls. This happens whenever cases and controls are sequenced at different depths, on different platforms, or in different batches. In this article, we provide a likelihood-based approach to testing rare variant associations that directly models sequencing reads without calling genotypes. We consider the (weighted) burden test statistic, which is the (weighted) sum of the score statistic for assessing effects of individual variants on the trait of interest. Because variant locations are unknown, we develop a simple, computationally efficient screening algorithm to estimate the loci that are variants. Because our burden statistic may not have mean zero after screening, we develop a novel bootstrap procedure for assessing the significance of the burden statistic. We demonstrate through extensive simulation studies that the proposed tests are robust to a wide range of differential sequencing qualities between cases and controls, and are at least as powerful as the standard genotype calling approach when the latter controls type I error. An application to the UK10K data reveals novel rare variants in gene BTBD18 associated with childhood onset obesity. The relevant software is freely available.

    Type I error of the weighted burden test at the nominal significance level of 0.01.

    Type I error of the weighted burden test at the nominal significance level of 0.01.

    Top ten genes for childhood onset obesity identified by New-STB using damaging variants in the analysis of the UK10K data.

    Top ten genes for childhood onset obesity identified by New-STB using damaging variants in the analysis of the UK10K data.